low displacement rank
Learning Compressed Transforms with Low Displacement Rank
The low displacement rank (LDR) framework for structured matrices represents a matrix through two displacement operators and a low-rank residual. Existing use of LDR matrices in deep learning has applied fixed displacement operators encoding forms of shift invariance akin to convolutions. We introduce a rich class of LDR matrices with more general displacement operators, and explicitly learn over both the operators and the low-rank component.
Reviews: Learning Compressed Transforms with Low Displacement Rank
I find presented results interesting and valuable, however it is unclear how significant for the community they really are. Authors seem to view the approach as a way to decrease overparametrisation of the networks with minimal effect on accuracy. However, there are dozens of techniques which try to address this very issue, which are not really compared against. Instead, authors choose to focus on comparing to other, very similar approaches of reparametrising neural network layers with LDRs. Pros: - Clear message - Visible extension of previous work/results - Showing both empirically and theoretically how proposed method improves upon baselines - Providing implementation of the method, thus increasing reproducability Cons: - all experiments are relatively low-scale, and the only benefits over not constraining the structure is obtained for MNIST-noise (90 vs 93.5%) and CIFAR-10 (65 vs 66%) which are not very significant differences at this level of accuracy for these problems.
Learning Compressed Transforms with Low Displacement Rank
Thomas, Anna, Gu, Albert, Dao, Tri, Rudra, Atri, Ré, Christopher
The low displacement rank (LDR) framework for structured matrices represents a matrix through two displacement operators and a low-rank residual. Existing use of LDR matrices in deep learning has applied fixed displacement operators encoding forms of shift invariance akin to convolutions. We introduce a rich class of LDR matrices with more general displacement operators, and explicitly learn over both the operators and the low-rank component. We prove bounds on the VC dimension of multi-layer neural networks with structured weight matrices and show empirically that our compact parameterization can reduce the sample complexity of learning. When replacing weight layers in fully-connected, convolutional, and recurrent neural networks for image classification and language modeling tasks, our new classes exceed the accuracy of existing compression approaches, and on some tasks even outperform general unstructured layers while using more than 20x fewer parameters.